chore: Added cryptography section to yellow paper by zac-williamson · Pull Request #3647 · AztecProtocol/aztec-packages

zac-williamson · 2023-12-11T16:02:07Z


maramihali

nice work, i did a first pass and left a few comments/suggestions :)

maramihali · 2023-12-11T16:20:35Z

+
+## Honk
+
+Honk is a variant of the PLONK protocol. Plonk performs polynomial testing via evaluating a polynomial relation is zero modulo the vanishing polynomial of a multiplicative subgroup. Honk performs the polynomial testing via evaluating, using a sumcheck protocol, that a relation over multilinear polynomials vanishes when summed over a boolean hypercube.


"by checking" instead of "via evaluating" seems more readable to me
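
As a rough illustration of the sumcheck idea described in the quoted paragraph, the sketch below runs a textbook sumcheck protocol over a toy prime field, with prover and verifier in one loop. It is not Honk's actual protocol or code: the modulus `P`, the example polynomial `g`, and the helper names `lagrange_eval` and `sumcheck` are all assumptions made for this example.

```python
import random
from itertools import product

P = 2**61 - 1  # toy prime modulus (assumption; not the field Honk uses)

def lagrange_eval(ys, x):
    """Evaluate at x the polynomial through (0, ys[0]), ..., (d, ys[d]) mod P."""
    total = 0
    for i, yi in enumerate(ys):
        num, den = 1, 1
        for j in range(len(ys)):
            if j != i:
                num = num * (x - j) % P
                den = den * (i - j) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def sumcheck(g, n, d, claimed_sum, rng):
    """Check that g sums to claimed_sum over {0,1}^n.

    g takes n field elements and has degree at most d (d >= 1) in each one.
    Returns True if the verifier accepts.
    """
    claim = claimed_sum % P
    challenges = []
    for round_index in range(n):
        free = n - round_index - 1
        # Prover: the round polynomial s(X), sent as its evaluations at 0..d.
        ys = [
            sum(g(*challenges, x, *bits) for bits in product((0, 1), repeat=free)) % P
            for x in range(d + 1)
        ]
        # Verifier: s(0) + s(1) must equal the running claim.
        if (ys[0] + ys[1]) % P != claim:
            return False
        r = rng.randrange(P)
        claim = lagrange_eval(ys, r)
        challenges.append(r)
    # Final check: a single evaluation of g at the random point.
    return g(*challenges) % P == claim

def g(x1, x2):
    # Hypothetical relation: zero on every boolean input, non-zero elsewhere.
    return x1 * (1 - x1) * (x2 + 3)

rng = random.Random(0)
print(sumcheck(g, 2, 2, 0, rng))  # honest claim: accepted
print(sumcheck(g, 2, 2, 1, rng))  # false claim: rejected in the first round
```

Note that this only checks that the sum over the hypercube is zero; a real proof system needs additional randomisation to conclude that the relation vanishes at every point, which this sketch omits.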

maramihali · 2023-12-11T16:21:41Z

+
+# Incrementally Verifiable Computation Subprotocols
+
+An Incrementally Verifiable Computation (IVC) scheme describes a protocol that enables multiple successive proofs to evolve the value taken by some defined persistent state over time.


"multiple successive proofs to evolve the value" reads a bit weird to me

changed to something that might make more sense.

maramihali · 2023-12-11T16:23:19Z

+
+Rollup-side, each "step" in the IVC scheme is a Honk proof, which are recursively verified. As a result, no protoocols other than Honk are required to execute rollup-side IVC.
+
+We perform one layer of "proof-system compression" in the rollup. The final proof of block-correctness is constructed as a Honk proof. An UltraPlonk circuit is used to verify the correctness of the Honk proof, so that the proof that is verified on-chain is an UltraPlonk proof (verification gas costs are lower for UltraPlonk vs Honk).


(verification gas costs are lower for UltraPlonk vs Honk) - I would state briefly why this is the case

maramihali · 2023-12-11T16:23:58Z

+The following sections list the protocol components required to implement client-side IVC.
+
+## Protogalaxy
+


I think this section should begin with defining what a folding scheme is

maramihali · 2023-12-11T16:24:49Z

+
+## Protogalaxy
+
+The [Protogalaxy](https://eprint.iacr.org/2023/1106) protocol defines a folding scheme that enables instances of a relation to be folded into a single instance of a "relaxed" form of the original relation.


Suggested change

-The [Protogalaxy](https://eprint.iacr.org/2023/1106) protocol defines a folding scheme that enables instances of a relation to be folded into a single instance of a "relaxed" form of the original relation.
+The [Protogalaxy](https://eprint.iacr.org/2023/1106) protocol defines a folding scheme that enables instances of a relation to be folded into a single instance of the original relation, but in a "relaxed" form.

maramihali · 2023-12-11T16:30:02Z

+
+#### Elliptic Curve Virtual Machine (ECCVM) Subprotocol
+
+The ECCVM is a Honk circuit with a custom circuit arithmetisation, designed to optimally evaluate elliptic curve arithmetic computations that have been deferred. It is defined over the Grumpkin elliptic curve


Suggested change

-The ECCVM is a Honk circuit with a custom circuit arithmetisation, designed to optimally evaluate elliptic curve arithmetic computations that have been deferred. It is defined over the Grumpkin elliptic curve
+The ECCVM is a Honk circuit with a custom circuit arithmetisation, designed to optimally evaluate elliptic curve arithmetic computations that have been deferred. It is defined over the Grumpkin elliptic curve.

maramihali · 2023-12-11T16:30:31Z

+
+#### Translator Subprotocol
+
+The Translator is a Honk circuit with a custom circuit arithmetisation, designed to validate the input commitments of an ECCVM circuit align with the delegated computations described by a Goblin Plonk transcript commitment


Suggested change

-The Translator is a Honk circuit with a custom circuit arithmetisation, designed to validate the input commitments of an ECCVM circuit align with the delegated computations described by a Goblin Plonk transcript commitment
+The Translator is a Honk circuit with a custom circuit arithmetisation, designed to validate that the input commitments of an ECCVM circuit align with the delegated computations described by a Goblin Plonk transcript commitment.

maramihali · 2023-12-11T16:31:23Z

+
+## Plonk Data Bus
+
+The [Plonk Data Bus](https://aztecprotocol.slack.com/files/U8Q1VAX6Y/F05G2B971FY/plonk_bus.pdf) protocol enables efficient data transfer between two Honk instances within a larger IVC protocol.


A sentence about why this is needed would be good

maramihali · 2023-12-11T16:32:32Z

+
+# Polynomial Commitment Schemes
+
+The UltraPlonk, Honk, Goblin Plonk and Plonk Data Bus protocols utilize Polynomial Interactive Oracle Proofs as a core component, neccessitating the use of polynomial commitment schemes (PCX).


Suggested change

-The UltraPlonk, Honk, Goblin Plonk and Plonk Data Bus protocols utilize Polynomial Interactive Oracle Proofs as a core component, neccessitating the use of polynomial commitment schemes (PCX).
+The UltraPlonk, Honk, Goblin Plonk and Plonk Data Bus protocols utilize Polynomial Interactive Oracle Proofs as a core component, thus requiring the use of polynomial commitment schemes (PCS).

maramihali · 2023-12-11T16:33:46Z

+
+## Inner Product Argument
+
+The [IPA](https://eprint.iacr.org/2019/1177.pdf) PCS has worse asymptotics than KZG but can be instantiated over non-pairing friendly curves.


State that this is needed for Grumpkin. While we are at it, maybe we should also say somewhere that we use a cycle of curves?

iAmMichaelConnor · 2023-12-12T13:16:30Z

+
+An Ethereum block consists of approximately 1,000 transactions, with a block gas limit of roughly 10 million gas. Basic computational steps in the Ethereum Virtual Machine consume 3 gas. If the entire block gas limit is consumed with basic computation steps (not true but let's assume for a moment), this implies that 1,000 transactions consume 3.33 million computation steps. i.e. 10 transactions per second would require roughly 33,000 steps per second and 3,330 steps per transaction.
+
+An AVM circuit with 1 million steps can therefore accomodate approximately 300 transactions. Proof construction time must therefore be approximately 30 seconds to be able to prove all AVM programs in a block and achieve 10 tps.


Do I take this to mean we're now aiming for a single VM circuit to execute multiple app functions (rather than a VM circuit per app function)?
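
For readers following the arithmetic in the quoted hunk, here is a small sketch that reproduces it. The input figures are the ones stated in the diff; the function name `avm_throughput_estimate` and the dictionary keys are made up for this example.

```python
def avm_throughput_estimate(
    block_gas_limit=10_000_000,   # "roughly 10 million gas"
    gas_per_step=3,               # basic EVM computational step
    txs_per_block=1_000,          # "approximately 1,000 transactions"
    target_tps=10,                # throughput target
    avm_circuit_steps=1_000_000,  # AVM circuit with 1 million steps
):
    """Reproduce the back-of-envelope estimate from the quoted paragraph."""
    steps_per_block = block_gas_limit / gas_per_step
    steps_per_tx = steps_per_block / txs_per_block
    steps_per_second = steps_per_tx * target_tps
    txs_per_circuit = avm_circuit_steps / steps_per_tx
    # One circuit covers txs_per_circuit transactions, which arrive over
    # txs_per_circuit / target_tps seconds at the target rate.
    proving_time_budget_s = txs_per_circuit / target_tps
    return {
        "steps_per_block": steps_per_block,
        "steps_per_tx": steps_per_tx,
        "steps_per_second": steps_per_second,
        "txs_per_circuit": txs_per_circuit,
        "proving_time_budget_s": proving_time_budget_s,
    }

for name, value in avm_throughput_estimate().items():
    print(f"{name}: {value:,.0f}")
```

With the default inputs this gives about 3,333 steps per transaction, 300 transactions per circuit, and a 30-second proving budget, matching the rounded figures in the text.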

AztecBot · 2023-12-18T13:24:28Z

Benchmark results

Metrics with a significant change:

circuit_simulation_time_in_ms (base-rollup): 2,960 (+43%)
circuit_input_size_in_bytes (base-rollup): 667,692 (+173%)
node_history_sync_time_in_ms (5): 24,869 (+58%)
node_history_sync_time_in_ms (10): 47,726 (+61%)
node_database_size_in_bytes (10): 5,832,592 (+37%)
note_history_successful_decrypting_time_in_ms (5): 2,182 (-20%)
note_history_successful_decrypting_time_in_ms (10): 4,240 (-17%)
note_history_trial_decrypting_time_in_ms (5): 170 (+109%)
l2_block_building_time_in_ms (8): 20,117 (+47%)
l2_block_building_time_in_ms (32): 80,409 (+49%)
l2_block_building_time_in_ms (128): 322,508 (+48%)
l2_block_rollup_simulation_time_in_ms (8): 16,592 (+65%)
l2_block_rollup_simulation_time_in_ms (32): 66,489 (+67%)
l2_block_rollup_simulation_time_in_ms (128): 266,465 (+65%)
l2_block_processing_time_in_ms (8): 2,210 (+66%)
l2_block_processing_time_in_ms (32): 8,472 (+67%)
l2_block_processing_time_in_ms (128): 35,167 (+66%)
note_successful_decrypting_time_in_ms (32): 926 (-28%)

Detailed results

All benchmarks are run on txs on the Benchmarking contract on the repository. Each tx consists of a batch call to create_note and increment_balance, which guarantees that each tx has a private call, a nested private call, a public call, and a nested public call, as well as an emitted private note, an unencrypted log, and public storage read and write.

This benchmark source data is available in JSON format on S3 here.

Values are compared against data from master at commit 4a1c0df7 and shown if the difference exceeds 1%.

L2 block published to L1

Each column represents the number of txs on an L2 block published to L1.

| Metric | 8 txs | 32 txs | 128 txs |
| --- | --- | --- | --- |
| l1_rollup_calldata_size_in_bytes | 45,444 | 179,588 | 716,132 |
| l1_rollup_calldata_gas | 222,780 | 868,268 | 3,449,552 |
| l1_rollup_execution_gas | 841,867 | 3,595,376 | 22,204,921 |
| l2_block_processing_time_in_ms | ⚠️ 2,210 (+66%) | ⚠️ 8,472 (+67%) | ⚠️ 35,167 (+66%) |
| note_successful_decrypting_time_in_ms | 326 (-7%) | ⚠️ 926 (-28%) | 3,384 (-13%) |
| note_trial_decrypting_time_in_ms | 33.4 (-45%) | 52.1 (+32%) | 202 (+44%) |
| l2_block_building_time_in_ms | ⚠️ 20,117 (+47%) | ⚠️ 80,409 (+49%) | ⚠️ 322,508 (+48%) |
| l2_block_rollup_simulation_time_in_ms | ⚠️ 16,592 (+65%) | ⚠️ 66,489 (+67%) | ⚠️ 266,465 (+65%) |
| l2_block_public_tx_process_time_in_ms | 3,493 (-1%) | 13,851 (-1%) | 55,800 (-1%) |

L2 chain processing

Each column represents the number of blocks on the L2 chain where each block has 16 txs.

| Metric | 5 blocks | 10 blocks |
| --- | --- | --- |
| node_history_sync_time_in_ms | ⚠️ 24,869 (+58%) | ⚠️ 47,726 (+61%) |
| note_history_successful_decrypting_time_in_ms | ⚠️ 2,182 (-20%) | ⚠️ 4,240 (-17%) |
| note_history_trial_decrypting_time_in_ms | ⚠️ 170 (+109%) | 233 (+15%) |
| node_database_size_in_bytes | 3,625,310 (-8%) | ⚠️ 5,832,592 (+37%) |
| pxe_database_size_in_bytes | 29,748 (-1%) | 59,307 |

Circuits stats

Stats on running time and I/O sizes collected for every circuit run across all benchmarks.

| Circuit | circuit_simulation_time_in_ms | circuit_input_size_in_bytes | circuit_output_size_in_bytes |
| --- | --- | --- | --- |
| private-kernel-init | 199 | 43,109 | 20,441 |
| private-kernel-ordering | 115 | 25,833 | 9,689 |
| base-rollup | ⚠️ 2,960 (+43%) | ⚠️ 667,692 (+173%) | 873 (-1%) |
| root-rollup | 89.9 (+2%) | 4,072 | 881 (-1%) |
| private-kernel-inner | 260 (-1%) | 64,516 | 20,441 |
| public-kernel-private-input | 172 | 25,203 | 20,441 |
| public-kernel-non-first-iteration | 168 (-2%) | 25,245 | 20,441 |
| merge-rollup | 10.9 (+10%) | 2,592 (-1%) | 873 (-1%) |

Miscellaneous

Transaction sizes based on how many contracts are deployed in the tx.

| Metric | 0 deployed contracts | 1 deployed contracts |
| --- | --- | --- |
| tx_size_in_bytes | 10,323 | 25,938 (-1%) |

joeandrews · 2023-12-18T13:48:28Z

+| 2x2 rollup proving time | 1 2x2 rollup proof | 7.4 seconds | 0.74 seconds |
+| 2x2 rollup memory consumption | 1 2x2 rollup proof | 128gb | 16gb |
+
+To come up with the above estimates, we are targetting 10 transactions per second for the MVP and 100 tps for the "ideal" case. We are assuming both block producers and rollup Provers have access to 128-core machines with 128gb of RAM. Additionally, we assume that the various process required to produce a block consume the following: 


@zac-williamson currently we have been assuming 32-64 cores max, as the performance benefit drops off after that.

See spreadsheet here: https://docs.google.com/spreadsheets/d/1cBBZZ_dyD0tiUmAdjoGbnLrk2H0oSmOgYubtZy9JTRU/edit#gid=1562975724

Secondly, ideally block producers (sequencers) can be run on 16-core, 32gb machines, as they will not be producing proofs.

codygunton · 2023-12-18T15:01:15Z

+
+The first protocol to combine Plonk and the sumcheck protocol was [HyperPlonk](https://eprint.iacr.org/2022/1355)
+
+Honk uses a custom arithmetisation that extends the Ultra circuit arithmetisation (not yet finalized)


I'd prob add something like "(e.g., it has special relations to efficiently prove Poseidon2 hashing)" but ofc this is not necessary.

added mention of Poseidon2

codygunton · 2023-12-18T15:02:25Z

+* Memory required to generate a user transaction proof
+* Time to generate an Aztec Virtual Machine proof
+* Memory required to generate an Aztec Virtual Machine proof
+* Time to compute a 2x2 rollup proof


I know this "two-by-two" terminology has taken hold, but it's just wrong--IMO a two-by-two thing involves four things of the same type. The term does not indicate any kind of compression.

Motion to change to "2x1" or "2-to-1" 🙏🙏

codygunton · 2023-12-18T15:07:25Z

+
+Rollup-side, each "step" in the IVC scheme is a Honk proof, which are recursively verified. As a result, no protoocols other than Honk are required to execute rollup-side IVC.
+
+We perform one layer of "proof-system compression" in the rollup. The final proof of block-correctness is constructed as a Honk proof. An UltraPlonk circuit is used to verify the correctness of the Honk proof, so that the proof that is verified on-chain is an UltraPlonk proof.


You could link to this lovely explanation written by me and Luke

codygunton · 2023-12-18T15:10:23Z

+
+#### Translator Subprotocol
+
+The Translator is a Honk circuit with a custom circuit arithmetisation, designed to validate that the input commitments of an ECCVM circuit align with the delegated computations described by a Goblin Plonk transcript commitment.  


Since you specified the ECCVM is defined over Grumpkin, maybe say here that the Translator is defined over BN254?

codygunton · 2023-12-18T15:20:32Z

+
+## Plonk Data Bus
+
+When passing data between successive IVC steps, the canonical method is to do so via public inputs. This adds significant costs to an IVC folding verifier (or recursive verifier when not using a folding scheme). Public inputs for part of the proof and therefore must be hashed prior to generating Fiat-Shamir challenges. When this is performed in-circuit, this adds a cost linear in the number of public inputs (with unpleasant constants ~30 constraints per field element).


Can we change "Public inputs for part of the proof and therefore must be hashed prior to generating Fiat-Shamir challenges." to "Public inputs must be hashed prior to generating Fiat-Shamir challenges."?
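
To make the linear cost mentioned in the quoted paragraph concrete, here is a tiny sketch. The figure of 30 constraints per field element is the approximate constant given in the diff; the function name is hypothetical and the result is only an order-of-magnitude estimate.

```python
CONSTRAINTS_PER_FIELD_ELEMENT = 30  # "~30 constraints per field element" in the quoted text

def public_input_hashing_constraints(num_public_inputs):
    """Approximate in-circuit cost of hashing public inputs before Fiat-Shamir."""
    return CONSTRAINTS_PER_FIELD_ELEMENT * num_public_inputs

for n in (16, 256, 4096):
    print(n, public_input_hashing_constraints(n))
```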

codygunton · 2023-12-18T15:26:43Z

+
+ "MVP" = minimum standards that we can go to main-net with.
+
+Note: gb = gigabytes (not gigabits, gigibits or gigibytes)


"gigifoo" $\leadsto$ "gibifoo" unless you're trolling

I thought giga = 2^30 and gigi = 10^9 ?

codygunton · 2023-12-18T17:33:06Z

+
+This sets the proof size limit to 819.2 kb per second per 100 transactions => 82 kilobytes of data per transaction.
+
+As a rough estimate, we can assume the non-proof tx data will be irrelevant compared to 82kb, so we target a proof size of $80$ kilobytes for the MPV.


MPV $\leadsto$ MVP

codygunton · 2023-12-18T17:33:24Z

+
+As a rough estimate, we can assume the non-proof tx data will be irrelevant compared to 82kb, so we target a proof size of $80$ kilobytes for the MPV.
+
+To support 100 transactions per second we would rquire a proof size of $8$ kilobytes.


rquire $\leadsto$ require
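
The proof-size targets in the quoted hunk can be checked with a short calculation. This sketch assumes that the 819.2 kb figure is a per-second data budget shared equally by all transactions in that second; that reading, and the function name `proof_size_budget_kb`, are assumptions rather than something the diff states outright.

```python
DATA_BUDGET_KB_PER_SECOND = 819.2  # figure quoted in the diff

def proof_size_budget_kb(tps):
    """Per-transaction data budget in kilobytes at a given throughput."""
    return DATA_BUDGET_KB_PER_SECOND / tps

print(f"{proof_size_budget_kb(10):.2f} kb per tx at 10 tps")
print(f"{proof_size_budget_kb(100):.2f} kb per tx at 100 tps")
```

Under that assumption the budget is about 82 kb per transaction at 10 tps and about 8 kb at 100 tps, in line with the rounded targets in the text.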

codygunton · 2023-12-18T17:33:55Z

+
+The critical UX factor. To measure prover time for a transaction, we must first define a baseline transaction we wish to measure and the execution environment of the Prover.
+
+As we build+refine our MPV, we want to avoid optimising the best-case scenario (i.e. the most basic tx type, a token transfer). Instead we want to ensure that transactions of a "moderate" complexity are possible with consuer hardware.


MPV $\leadsto$ MVP

codygunton · 2023-12-18T17:38:53Z

+
+Note: this excludes network coordination costs, latency costs, block construction costs, public VM proof construction costs (must be computed before the 2x2 rollup proofs), cost to compute the final UltraPlonk proof.
+
+To accomodate the above costs, we assume can budget 40% of block production time towards making proofs. Given these constraints, the following table describes maximum allowable proof construction times for a selection of block sizes.


"we assume can budget" $\leadsto$ "we assume that we can budget"
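
One possible way to turn the 40% budget into a per-proof time limit is sketched below. The model is an assumption, not taken from the diff: it supposes that a block of `block_size` transactions takes `block_size / tps` seconds to fill, that rollup proofs form a binary tree, and that each layer of the tree is proved fully in parallel. The function name is hypothetical.

```python
import math

PROOF_BUDGET_FRACTION = 0.4  # "40% of block production time" from the quoted text

def max_rollup_proof_time_s(block_size, tps):
    """Upper bound on the time for one rollup proof under the assumed model.

    Requires block_size >= 2.
    """
    block_time_s = block_size / tps
    layers = math.ceil(math.log2(block_size))  # depth of the assumed binary tree
    return PROOF_BUDGET_FRACTION * block_time_s / layers

for block_size in (128, 512, 2048):
    print(block_size, round(max_rollup_proof_time_s(block_size, tps=10), 2))
```

Under these assumptions, larger blocks leave more time per proof, because block time grows linearly with block size while tree depth grows only logarithmically.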

zac-williamson added 3 commits December 11, 2023 16:01

initial commit

81681fd

protogalaxy tweak

3f3fc53

added performance targets

f2fdfae
